Poster: cdu Date: Dec 20, 2003 6:02pm
Forum: millionbooks Subject: djvu vs djvulibre and texts in 1m books


Hi.

I visited archive.org in person last Friday to help with making books. I asked someone there about using DjVu to view and read texts, and he pointed out that DjVu can run OCR on the page images in addition to being a fancy bitmap viewer. I was quite impressed. I tried this at home, and it turns out that the version of the DjVu tools I have (DjVuLibre-3.5.12) does not include the OCR functionality. I was sad.

The documentation on your site, in the FAQ, says:

... This file will also be ocr'd to make the text searchable.( /djvu/bin/documenttodjvu --filelist.txt temp.djvu, /djvu/bin --ocr aatttt.djvu)


Does the Windows version of this software have the OCR functionality in it? Would it be possible for you to OCR the files on your site so that those of us without the commercial version of the DjVu tools can read the text without looking at the bitmaps?


This is some fascinating stuff and I'm pretty amazed at it all. Thanks for putting it all together.
chris